A Conformal Classifier for Dissimilarity Data

Authors

  • Frank-Michael Schleif
  • Xibin Zhu
  • Barbara Hammer

Abstract

Current classification algorithms focus on vectorial data, given in Euclidean or kernel spaces. Many real-world data, such as biological sequences, are not vectorial and are often non-Euclidean, given only by (dis-)similarities, which calls for efficient and interpretable models. Current classifiers for such data require complex transformations and provide only crisp classifications without any measure of confidence, which is a standard requirement in the life sciences. In this paper we propose a prototype-based conformal classifier that deals effectively with dissimilarity data. The model complexity is adjusted automatically and confidence measures are provided. In experiments on dissimilarity data we investigate its effectiveness with respect to accuracy and model complexity in comparison to different state-of-the-art classifiers.
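
To make the idea of a confidence measure on dissimilarity data concrete, the following is a minimal sketch of generic inductive conformal prediction with one prototype per class. It is not the authors' algorithm: the medoid prototypes, the distance-ratio nonconformity score, the toy 1-D data, and all function names (`class_medoids`, `nonconformity`, `conformal_predict`) are assumptions chosen only for illustration. It uses nothing but a pairwise dissimilarity matrix, never vector coordinates, inside the classifier.

```python
import numpy as np

def class_medoids(D, y, fit_idx):
    """Assumed prototype rule: per class, the point with the smallest
    summed dissimilarity to the other fitting points of that class."""
    protos = {}
    for c in np.unique(y[fit_idx]):
        members = fit_idx[y[fit_idx] == c]
        within = D[np.ix_(members, members)].sum(axis=1)
        protos[int(c)] = int(members[np.argmin(within)])
    return protos

def nonconformity(d_row, protos, label):
    """Assumed score: dissimilarity to the prototype of `label` divided by
    the dissimilarity to the closest prototype of any other class.
    Larger values mean the (point, label) pair looks stranger."""
    d_own = d_row[protos[label]]
    d_other = min(d_row[p] for c, p in protos.items() if c != label)
    return d_own / (d_other + 1e-12)

def conformal_predict(D, y, fit_idx, cal_idx, d_new):
    """D: (n, n) dissimilarities among known points.
    d_new: (n,) dissimilarities from the new point to the known points.
    Returns (label, confidence, credibility, p-values per class)."""
    protos = class_medoids(D, y, fit_idx)
    # Calibration scores come from held-out points only (inductive setting).
    cal = np.array([nonconformity(D[i], protos, int(y[i])) for i in cal_idx])
    pvals = {}
    for c in protos:
        a = nonconformity(d_new, protos, c)
        # Fraction of calibration scores at least as strange as the new one.
        pvals[c] = (np.sum(cal >= a) + 1) / (len(cal) + 1)
    ranked = sorted(pvals, key=pvals.get, reverse=True)
    label = ranked[0]
    credibility = pvals[ranked[0]]        # largest p-value
    confidence = 1.0 - pvals[ranked[1]]   # 1 - second largest p-value
    return label, confidence, credibility, pvals

# Toy data: 1-D points are used only to build a dissimilarity matrix.
points = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5,
                   5.0, 5.1, 5.2, 5.3, 5.4, 5.5])
y = np.array([0] * 6 + [1] * 6)
D = np.abs(points[:, None] - points[None, :])

fit_idx = np.array([0, 1, 2, 6, 7, 8])    # used to choose prototypes
cal_idx = np.array([3, 4, 5, 9, 10, 11])  # used for calibration
d_new = np.abs(points - 0.25)             # new point, seen only via dissimilarities

label, confidence, credibility, pvals = conformal_predict(
    D, y, fit_idx, cal_idx, d_new)
print(label, confidence, credibility)
```

Here the prediction is the class with the largest p-value, credibility is that p-value, and confidence is one minus the second-largest p-value, which is the usual way conformal predictors attach confidence to a crisp label.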

Similar resources

Adaptive conformal semi-supervised vector quantization for dissimilarity data

Keywords: Semi-Supervised Learning; Proximity Data; Dissimilarity Data; Conformal Prediction; Generalized Learning Vector Quantization. Existing semi-supervised learning algorithms focus on vectorial data given in Euclidean space. But many real life data are non-metric, given as (dis-)similarities which are not widely addressed. We propose a conformal prototype-based classifier for dissimilarity data to semi-...

Adaptive prototype-based dissimilarity learning

In this thesis we focus on prototype-based learning techniques, namely three unsupervised techniques: generative topographic mapping (GTM), neural gas (NG) and affinity propagation (AP), and two supervised techniques: generalized learning vector quantization (GLVQ) and robust soft learning vector quantization (RSLVQ). We extend their abilities with respect to the following central aspects: • Ap...

Secure Semi-supervised Vector Quantization for Dissimilarity Data

The amount and complexity of data increase rapidly; however, due to time and cost constraints, only a few of them are fully labeled. In this context non-vectorial relational data given by pairwise (dis)similarities without explicit vectorial representation, like score values in sequence alignments, are particularly challenging. Existing semi-supervised learning (SSL) algorithms focus on vectorial...

An association-based dissimilarity measure for categorical data

In this paper, we propose a novel method to measure the dissimilarity of categorical data. The key idea is to consider the dissimilarity between two categorical values of an attribute as a combination of dissimilarities between the conditional probability distributions of other attributes given these two values. Experiments with real data show that our dissimilarity estimation method improves t...

Spatial Representation of Dissimilarity Data via Lower-Complexity Linear and Nonlinear Mappings

Dissimilarity representations are of interest when it is hard to define well-discriminating features for the raw measurements. For an exploration of such data, the techniques of multidimensional scaling (MDS) can be used. Given a symmetric dissimilarity matrix, they find a lower-dimensional configuration such that the distances are preserved. Here, Sammon nonlinear mapping is considered. In gen...

Publication date: 2012